Six Criteria for Survey Sample Design Evaluation
ABSTRACT 

The popularity of sample surveys in educational research makes it necessary for consumers to tell a
good study from a poor one. Several sources were identified that gave advice on how to evaluate a
sample design. The sources are either too limited or too extensive to use in a practical sense. The purpose
of this paper is to recommend six important yet practical criteria for evaluating the quality of a sample
design in survey research. The six criteria are 1) a clearly specified population, 2) an explicitly stated
unit of analysis, 3) a specification of how the desired sample size is determined, 4) an informative description
of the selection procedures, 5) a description of response rate and nonresponse treatment, and 6)
a demonstration of appropriate estimation procedures. For each criterion, discussion is focused on
definitions, problems that are found in the literature, and consequences of the problems.

Introduction 

Survey research is widely used and reported in educational studies. Since a survey is seldom, 
if ever, administered to everybody in a group of interest (population) in educational studies, a survey 
is actually a sample survey, i.e., only a part of the group is usually involved. Researchers using the 
sample survey method hope to gain knowledge about the population with the results from the sample 
in hand. When such research results are reported, consumers of research (policy makers, practitioners, 
other researchers, etc.) expect to learn something they are interested in about that population. 

Because sample surveys work on partial information from a population, requirements have 
been established for conducting a sample survey to ensure the quality of information obtained and 
the validity of applying this information to the population. Of those various requirements, the ones 
concerning sample design deserve great attention. Typically, these sample design requirements involve
sample selection and estimation procedures (Kish, 1965), which provide technical assurance of the
validity of the survey findings for their intended use.

Just as in other aspects of educational research, sample design in survey research is not free
from problems. In fact, many educational survey studies have flaws in their sample designs
(Miskel & Sandlin, 1981; Pena & Henderson, 1986; Permut, Michel, & Joseph, 1976; Wang, 1996;
Wang & McNamara, 1997). Whatever findings or conclusions are drawn from those studies are prone
to validity problems. It is, therefore, very important that an informed
consumer of such research be equipped with adequate knowledge and skills to evaluate the usefulness
of the information from such research so as not to fall victim to blind belief in misinformation.
Recommendations about how to design a good sample survey are provided in many texts; these
recommendations can also serve as general guidelines for evaluating survey research. Criteria
specifically used in evaluating sample surveys are, however, found in only a few studies; the criteria
range from broad categories (Miskel & Sandlin, 1981; Permut et al., 1976) to detailed checklists
(Jaeger, 1988; McNamara, 1994).

In the evaluation work of both Miskel and Sandlin (1981) and Permut et al. (1976), the broad
criteria included population specification, unit of analysis, sampling frame, response rate, and selection
procedures. These are very important criteria for sample design evaluation, especially those about
population, unit of analysis and response rate. However, these criteria appear to be confined to the
selection part of sample design only, whereas the estimation part, as defined by Kish (1965), is not
mentioned at all. Estimation is closely related to selection in a sample design. For example, both unit
of analysis and response rate have a bearing on what analysis is used in estimation; population
specification determines how the results will be interpreted. It is therefore necessary to include estimation
procedures in the evaluation criteria.

The checklists from Jaeger (1988) and McNamara (1994) are both extensive, with 26 and 100 
items, respectively. The items cover a wide range of aspects of survey research, from research 
question and instrumentation to survey finding interpretation and recommendations. These checklists 
are valuable sources of information and guidance to researchers and people learning to conduct survey 
research. This is particularly true of McNamara’s list, which is so detailed that anyone who follows it will
certainly gain in-depth knowledge about a sample design. The downside of these two checklists is,
however, that the lists are too long and may become overwhelming to users in practice. Although the 
items in the checklists are important, some are more important than others with regard to their impact 
on the outcome of a survey study. Accordingly, from the viewpoint of a consumer of survey research 
findings, one needs to know the most important or crucial criteria that can be conveniently applied 
to practical use in evaluating survey research literature. 

It is helpful for survey research consumers to know what problems one may expect to find
in a sample design and what the possible consequences of such problems are. For example, one
should be able to tell from the description of a sample design whether the reported sample size may
be problematic, and if it is, what impact this problem may have on the findings. Equipped with such
knowledge, one may then make an informed judgment about how to use the information from this survey
research in an appropriate manner. Such a discussion is not directly available in the texts presenting
the checklists for survey evaluation.

The purpose of this paper is to provide a focused guide that highlights what we deem as key 
criteria, with emphasis on educational research applications. This focused guide considers six criteria 
that can be conveniently used by educational professionals (researchers, policy makers, administrators 
and teachers) to evaluate the quality of sample designs in educational surveys. These six criteria are: 

1. a clearly specified target population and survey population, 

2. an explicit statement of unit of analysis in light of the research questions, 

3. a specification of a desired sample size and how this is determined, 

4. an informative description of the selection procedures, 

5. a description of response rate with information about nonresponse, and

6. a demonstration of appropriate estimation and data analysis according to the selection
strategies used, including the treatment of possible nonresponse bias and the
generalization of findings.

In presenting each criterion, three aspects will be discussed. First, the criterion will be described
or defined within the sample survey context. Second, problems that might be encountered in
relation to the criterion will be identified. Where necessary, examples drawn from an evaluation study by Wang (1996)
will be used for illustration. Finally, possible consequences of not meeting the criterion will be
explored.

The six criteria



Criterion One: Population specification 

The population is central to survey research because its characteristics are what a sample
survey is designed to estimate. In survey
research, a population is usually a finite population that “consists of a known, finite number N of 
units—such as people or plots of ground. With each unit is associated a value of a variable of interest” 
(Thompson, 1992, p. 2). A distinction is often made between a target population and a survey 
population because the two may not be the same in reality (Borg & Gall, 1989; Kish, 1965; Sudman, 
1976). 

A target population (also called the inference space in statistical theory) is the ideal population 
of interest, or the desired population, or “the total finite population about which we require 
information” (Barnett, 1991, p. 8). A target population must be defined in terms of content, units, 
extent, and time (Kish, 1965). Units refers to whether a target population is a population of
individuals, schools, households, etc. (Sudman, 1976). These units are also defined as sampling units,
which can be individuals or aggregates such as schools or classes. For example, if we want to
estimate the proportion of public school principals in the United States who hold graduate degrees, 
the target population then should consist of every single public school principal in the country. The 
sampling units are the individual principals. In practice, however, the target population is often 
difficult to access due to various constraints and has to be modified and reduced to a survey 
population. The importance of having a target population is that we know what has been excluded 
to obtain the survey population, and we can therefore assess the consequences of the exclusions in 
interpretation (Kalton, 1983b). 

A survey population (also called study population, experimentally accessible population, 
operational population, or sampled population) is the actual population studied. In the example above, 
the survey population may not include principals of schools located in some remote or thinly 
populated areas to reduce the cost of administering the survey. That is, "resources and feasibility
determine that some members of a target population have to be excluded" (Fink, 1995). The
remaining members of the target population then constitute a survey population. The difference
between the two types of population requires additional consideration in making inferences and
generalizations. Since a sample is taken from a survey population, statistical inference can be made
only to the survey population. The generalization of results from the survey population to the target
population is not based on any statistical decision but, instead, on substantive knowledge and 
subjective judgment (Babbie, 1986; Borg & Gall, 1989; Cochran, 1977; Jolliffe, 1986; Sudman, 1976). 
In other words, population validity has to be established to justify appropriate generalizations from 
a survey population to a target population. Any relevant differences between the two populations need 
to be accounted for. A simple way to make this point might be to use Jaeger's (1988) comment that 
a target population reflects a researcher's desires and a survey population (or a sampling frame) 
defines reality. 

Depending on the actual scope and nature of a survey study, the two types of population defined
above can be the same. It is possible to sample directly from a target population. This can
happen when a researcher conducts a survey in a special population that can be controlled with some
ease. For instance, when a researcher uses a sample survey to obtain certain information from
a population of full-time school teachers in a district, the target population includes all the full-time
teachers as defined by that district. This target population is typically limited in size and easily
accessible; the sample can be taken from a sampling frame built from a roster of all the teachers in
the district. Therefore, no difference exists between the target population and survey population. In
many other situations, however, compromises or modifications have to be made between a survey
population and a target population.

It has been observed in the educational research literature that many survey-based studies fail to
report or describe adequately what populations are used. For example, in an evaluation of the survey
studies published in a peer-reviewed journal in the 1970s, Miskel and Sandlin (1981) reported that only
seven of the 23 studies evaluated (30.4%) specified their populations. Wang (1996) examined the
sample designs in the survey studies published by the same journal during the 1980s and 1990s and found
that only 11 of the 53 sample designs evaluated (20.8%) mentioned their populations.

Where populations are not reported, even if they were defined in research, it is left for readers 
to figure out what populations may have been involved. The missing information about the population 
has at least two consequences. One is that readers cannot form a clear idea of
what population was actually studied, nor can they easily evaluate whether a sample was
good enough to be a fair representation of its population. The other consequence is that the scope of
generalization of any findings becomes a questionable issue, since no particular population has been 
specified. Although the authors of such studies often discuss the implication or significance of their 
findings, which sounds like generalization, it is difficult to justify such a discussion in the absence 
of a clearly defined population. This problem will certainly affect the validity of a study no matter 
how well the other parts of the study proceed. 

Criterion Two: Unit of analysis specification 

The criterion for specifying unit of analysis in a sample design is necessary because this 
specification dictates outcomes of data analysis. Descriptions of unit of analysis can be found in many 
texts (Babbie, 1986; Borg & Gall, 1989; Fink, 1995). Babbie (1986) defined units of analysis as 
“those units that we initially describe for the ultimate purpose of aggregating their characteristics in 
order to describe some larger group or explain some abstract phenomenon” (p. 74). The importance 
of specifying units of analysis in social research is emphasized again and again in the same text, 
“When the unit of analysis is not so clear, however, it is absolutely essential to determine what it is; 
otherwise, you will be unable to determine what observations are to be made about whom or what” 
(Babbie, 1986, p. 76). This statement applies to many educational fields which tend to draw research 
methods from the social and behavioral sciences. 

Specifying units of analysis is done in the design stage according to the purposes of a study, 
which is to investigate some phenomena at either the individual or the group level. Borg and Gall
(1989) have made it clear that, in education, activities occur at different levels: individual students,
groups of special interest, classroom, school, etc. Researchers need to decide which of these levels
contains the phenomena of interest to them, and to develop a theory or explanation for why they are 
interested in these phenomena. 

For example, if a researcher has a good reason to study teachers' job satisfaction, individual
teachers should be the unit of analysis even though schools may have to be selected first before
teachers can be sampled from the schools. The schools are not the units of analysis because they do
not constitute the interest of the study. In another example, if a researcher plans to investigate what
factors are related to school effectiveness, the unit of analysis is the school, not individuals like
principals, teachers, or students, even though raw data must be collected from individuals working
at a school. The individual data are usually aggregated to the level (school) defined by the unit of
analysis before further analysis is conducted.

Unit of analysis is often overlooked in survey studies: some researchers fail to specify
their units of analysis. Again, Miskel and Sandlin (1981) found that only seven of 23
studies reported what units of analysis were used. In Wang’s (1996) evaluation,
only 23 of 53 sample designs had such information. In other words, still more than half of the studies
did not say what units of analysis had been considered in their designs and analysis.

Failure to specify unit of analysis may not lead to serious problems in a simple study that only 
involves data from one level; the correspondence between research questions and unit of analysis is 
sometimes apparent even if nothing is said about the latter. In a more extended study that involves 
sampling at several levels, however, problems may arise if unit of analysis is not given due attention. 
A typical problem is that the analysis is conducted at the wrong level.

For example, when we study a sample of 20 classes with five students from each class of 30 to 
investigate students’ expectation of their teachers, class is the right unit of analysis and the data from 
five students should be aggregated to yield a data point for their class. In the subsequent analysis, the 
sample size is 20, the number of classes used. If the analysis is conducted using individual students 
as unit of analysis, the sample size would be 100. The findings would be distorted and not appropriate 
for the interest of research. Because of the exaggerated sample size and the inflated statistical power, 
trivial differences among the classes may now become statistically significant. Any conclusions based 
on this result fall victim to the misuse of the unit of analysis in the study. 
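
The class-level aggregation described above can be sketched in Python. This is only an illustration under stated assumptions: the scores are made-up data drawn from a seeded random generator, the 1 to 5 rating scale is assumed, and the function name aggregate_to_class is hypothetical.

```python
import random
from statistics import mean


def aggregate_to_class(student_scores):
    """Collapse student-level records into one data point (the mean) per class."""
    return {class_id: mean(scores) for class_id, scores in student_scores.items()}


# Hypothetical data: 20 classes, 5 sampled students per class, ratings on a 1-5 scale.
rng = random.Random(0)
student_scores = {
    class_id: [rng.randint(1, 5) for _ in range(5)]
    for class_id in range(1, 21)
}

class_means = aggregate_to_class(student_scores)

n_students = sum(len(scores) for scores in student_scores.values())
n_classes = len(class_means)

print("student records collected:", n_students)
print("class-level data points analyzed:", n_classes)
```

Any subsequent test or estimate would then be computed on the 20 class means rather than on the 100 student records, so the sample size matches the unit of analysis.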

Criterion Three: Sample size determination 

Another important task in sample design is determining an appropriate sample size, and this
information should be reported. Sample size decisions are primarily based on desired analysis and
efficiency. The analysis refers to parameter estimation and hypothesis testing, the two fundamental
components of statistical analysis. Sample size plays an important role in both cases in that sample 
size influences the margin of error, or degree of precision in estimation, and affects statistical power 
in hypothesis testing (Cohen, 1988; Fink, 1995; McNamara, 1994; Sheaffer, Mendenhall, & Ott, 
1990). Efficiency is about obtaining the same amount of information with as small a sample as 
possible. As Jaeger (1988) put it, “Choosing the best possible sampling method is one important way 
of increasing the efficiency of a survey, thus reducing costs without sacrificing quality or precision” 
(p. 317). 

To illustrate this point, Jaeger (1988) gave an example that, to estimate the average score of 1,200
students in a school system, the most efficient procedure would require testing only 25 students, 
whereas the least efficient sampling and estimation procedures would have to test 1,041 students. 
Methods for calculating desired sample sizes can be found in many statistics and research method 
books, among which an introductory sampling text by Sheaffer et al. (1990), a statistical power 
handbook by Cohen (1988), and a text by McNamara (1994) appear to be fairly comprehensive and 
easy to use. 

Specifying sample size decisions appears to be another weak area in that survey studies in
the educational literature tend to give only the achieved sample size used in analysis but provide little
information about how the planned sample size was determined. For example, in the 53 sample designs
evaluated by Wang (1996), only two mentioned their sample size decisions. Not reporting
the planned sample size in a survey study makes it difficult for other researchers
and readers to evaluate the usefulness of the findings reported in the study. Even if the achieved
sample size is good enough to justify the results of statistical analyses, there still remains a question
about how well the findings are generalizable.

For instance, a sample size of 100 may be adequate for a hypothesis test or an estimation of
the population mean. There is a big difference, however, as to whether this sample size of 100 is from
a design requesting 120 people or 200 people, that is, from a return of about 83% or of 50%. Such
knowledge is definitely important to anyone who is interested in the results of the study. This also
touches on the response rate issue discussed later in this paper.

Another observation here is that one needs to be aware of what sampling scheme is used when
determining the sample size in a survey study. The sample size tables that one typically finds in a
reference text, such as the one listed in McNamara’s book (1994, p. 3), usually assume a simple random
sampling scheme. In practice, true simple random samples are not often used; other schemes such as
stratified or cluster sampling methods are frequently employed instead. When these complex
sampling methods, meaning methods other than simple random sampling, are involved, it is not
advisable to use a sample size table of the kind mentioned above to select a sample size. Depending on the actual
sampling method, different formulas are needed for sample size calculation.
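
As a hedged illustration of why the sampling scheme matters for sample size, the sketch below uses two textbook formulas that are not taken from the tables cited above: the sample size for estimating a proportion under simple random sampling with a finite population correction, and its inflation by a design effect for a complex design. The function names, the 5% margin of error, the population of 1,000, and the design effect of 1.5 are all illustrative assumptions.

```python
import math


def srs_sample_size(population_size, margin_of_error, p=0.5, z=1.96):
    """Sample size for estimating a proportion under simple random sampling.

    Uses n0 = z^2 * p * (1 - p) / e^2, then applies the finite population
    correction n = n0 / (1 + (n0 - 1) / N). The default p = 0.5 is the most
    conservative choice; z = 1.96 corresponds to 95% confidence.
    """
    n0 = (z ** 2) * p * (1 - p) / margin_of_error ** 2
    n = n0 / (1 + (n0 - 1) / population_size)
    return math.ceil(n)


def complex_sample_size(srs_n, design_effect):
    """Inflate a simple-random-sampling size by an assumed design effect."""
    return math.ceil(srs_n * design_effect)


# Hypothetical case: 1,000 teachers, 5% margin of error.
n_srs = srs_sample_size(population_size=1000, margin_of_error=0.05)

# Assumed design effect of 1.5 for a cluster design (illustrative value only).
n_cluster = complex_sample_size(n_srs, design_effect=1.5)

print("simple random sample size:", n_srs)
print("cluster sample size for the same precision:", n_cluster)
```

The point of the comparison is that a table value computed under simple random sampling understates the number of cases a clustered design needs to reach the same precision.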

Criterion Four: Selection procedure description 

A clear description of the sample selection procedures implemented is an integral component
of sample design in a research report. It is a common practice, in reporting quantitative research
findings, to present detailed information about how the research, say an experiment, is conducted at 
every step. Such information serves two purposes. One is to let other researchers have sufficient 
information to evaluate the research and findings. The other is to provide adequate information to 
some experienced researchers who may want to replicate the research under comparable conditions 
(American Psychological Association, 1995). 

In large-scale national surveys conducted by survey specialists, sample selection procedures are 
usually described in great detail in technical reports. In academic publications, journal space limits
do not allow every technical detail to be presented in an article. However, the space limits do
not justify disclosing nothing at all about how a sample was actually selected. This is why a method section
is required for a quantitative research report. In fact, for a description of selection procedures, a few
lines will be enough if a sample design is not very complex, which is usually the case in many
educational survey studies.

Lack of an adequate description of sample selection procedures is frequently found in survey 
research publications. One may find such things as achieved sample size, demographics, and the type 
of sampling scheme in the methods section in a report. Little, however, is said about how the sample
is selected from a population. In the 53 sample designs reviewed (Wang, 1996), about 10 appeared
to describe their selection procedures adequately; the rest did not. The fact that 43 of the 53
sample designs did not provide adequate information about their selection procedures suggests a rather
common yet unjustified practice in reporting survey research findings. This practice is also found
elsewhere (Pena & Henderson, 1986). It is surprising that sample selection procedures were treated
so lightly in so many articles that were published to share research experience and results and to add
to the knowledge base of the field.

Without enough information about the selection procedures used in obtaining a sample, it is hard to
evaluate the usefulness of the sample in estimating a population characteristic. A major problem lies
in the fact that there is no way to assess whether and how selection bias may affect the nature of the
achieved sample. For example, selecting a true simple random sample assumes that a researcher has
access to every member in the population and everyone is willing to cooperate. This seldom happens
in reality. In a large-scale survey that may involve several school districts or states, things become
even more complicated and there have to be various compromises between the desired and achieved
samples. When this occurs but no description is available about the actual sample selection process,
how can other researchers know what happened in between and judge the value of the data and
the findings of the study? Even where a really good survey study is designed and carried out, the lack of a
description of the sample selection procedures makes it hard for others to learn from the study. This is a waste of
talent and knowledge.

Criterion Five: Response rate and nonresponse treatment 



A major concern for survey researchers is the level of participation by sampled subjects.
The participation level is measured by response rate. Although nonresponse and its effects seem to
be the concern of estimation, data analysis and generalization, the problem should be considered at
every phase of sample design, from population definition and sample size decisions to selection,
estimation and interpretation. Information about nonresponse rates and nonrespondents' characteristics
should also be disclosed in a research report, as is emphasized in survey research literature (Aiken, 
1988; Borg & Gall, 1989; Jaeger, 1988; Shultz & Luloff, 1990; West, 1991). 

The nonresponse problem has to be handled in both the sample selection and estimation procedures
of sample design. A researcher should always strive for high response rates when selecting a sample,
which also means minimizing nonresponse rates. The nonresponse rate itself is not as simple as one
might think. Although people tend to think of nonresponse rate in terms of unit nonresponse rate for
the whole sample, nonresponse also occurs, and should be investigated, at the level of item
nonresponse or subgroup nonresponse rates. For example, suppose everyone in a sample responded to a
survey, yielding a perfect response rate, but some items in the completed
questionnaires were left blank for some reason (say, most of the items asked sensitive questions about which
few subjects wanted to give information). This item nonresponse would pose a dilemma for the
researchers in this case.

On the other hand, suppose a survey was designed to study compensation inequality between
male and female administrators, and a stratified sample of both genders was selected. The data showed that
the overall response rate was high (say 80%). Further examination, however, found that one gender
group had a near perfect return rate while the other had a low response rate. The difference in the
subgroup nonresponse rates would also be a threat to the validity of the findings of this study if this
problem was not properly addressed. In short, the nonresponse rate in a research report is a crucial
piece of information with which the quality of data and validity of findings can be evaluated. This
explains why such information needs to be made available to other researchers.
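
The way a high overall response rate can hide a weak subgroup can be shown with a small sketch. The counts below are hypothetical, chosen only so that the overall rate comes to the 80% mentioned above; the stratum labels and the function name response_rate are likewise assumptions.

```python
def response_rate(returned, sampled):
    """Unit response rate: completed returns divided by units sampled."""
    return returned / sampled


# Hypothetical stratified sample: 100 administrators drawn from each gender stratum.
strata = {
    "stratum_a": {"sampled": 100, "returned": 98},  # near perfect return
    "stratum_b": {"sampled": 100, "returned": 62},  # low return
}

total_sampled = sum(s["sampled"] for s in strata.values())
total_returned = sum(s["returned"] for s in strata.values())

overall_rate = response_rate(total_returned, total_sampled)
rates_by_stratum = {
    name: response_rate(s["returned"], s["sampled"]) for name, s in strata.items()
}

print("overall response rate:", round(overall_rate, 2))
for name, rate in rates_by_stratum.items():
    print(name, "response rate:", round(rate, 2))
```

Reporting only the overall figure would conceal the 36-point gap between the two strata, which is exactly the information a reader needs to judge a comparison between the groups.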

A related issue is the treatment of nonresponse when nonresponse is high and becomes a problem.
A comparison between nonrespondents and respondents on variables of interest is an important step
in knowing one's sample and data quality. This information is also of interest to other researchers doing
similar investigations, and should be disclosed if available. It is helpful to remember, however, that a
detailed sample description is not the same as a proper treatment of nonresponse. Sample description
has more to do with population coverage in that such a description may show how comparable the
sample and its population are on variables of interest. Investigation of nonrespondent characteristics,
on the other hand, addresses possible bias problems resulting from refusal to respond for certain
reasons. It should be stressed that investigation of nonrespondents' characteristics is always necessary
(Aiken, 1988) and cannot be omitted. Generally speaking, the nonresponse problem should be addressed
in both the selection process and data analysis. In selection, aggressive follow-up efforts need to be
made to encourage or persuade people to participate. In data analysis, adjustments can be made to
reduce the influence of the nonresponse.

The information from the 53 sample designs reviewed indicates that most of the studies did
mention the response rates achieved, but few discussed how the nonresponse issue was handled. On the
one hand, disclosing response rate is always a good practice as it gives readers an important piece
of information about the quality of the sample and the study. On the other hand, it is not acceptable
to leave nonrespondents' characteristics uninvestigated. In some of the studies evaluated, the response rate was
rather low (50% or lower). Without any explanation about the nature of nonresponse and
nonrespondents, the survey findings are open to question: some response bias is likely to be present
in the data from the achieved sample but the bias is not explored. A possible consequence of this
situation is that the findings may not apply even to the originally planned sample, let alone the population
which the sample is supposed to represent.

Criterion Six: Estimation procedures 



Estimation procedures in a survey design may include such aspects as weighting, variance
estimation, design effect, confidence interval estimation and treatment of nonresponse bias.
Estimation is considered a criterion in sample design evaluation because estimation procedures are
closely related to the selection procedures and the data are the outcome of the sample selection.

Weighting. This is an important strategy used in survey data analysis to adjust for unequal
selection probabilities that result from complex sample designs; it is also used in poststratification
and in making adjustments for total nonresponse (Kalton, 1983a, 1983b; Lee, Forthofer, & Lorimor,
1989). One typical instance of unequal selection probability occurs in stratified sampling where
intentional oversampling of a particular stratum is desired to secure enough cases from the stratum
for a reliable estimation. This happens when both total population and subpopulation (stratum)
parameter estimates are needed, but a subpopulation may hold only a small proportion of the total 
population. 

Suppose we want to estimate teachers' job satisfaction in a school district of 1,000 teachers and
we are also interested in the responses of the minority teachers in the district. We know that the latter
group makes up only six percent of the teacher population in the district. With an ideal simple random
sample of 100 from the population, we will end up with about six minority teachers and 94
non-minority ones. Although 100 teachers may satisfy one’s need to estimate the characteristics of the
teacher population as a whole, the group size of six is too small to tell about the minority teachers
as a subpopulation. Therefore, we need to stratify the population by designating this particular
minority group as a stratum and over-sample this group. For example, we may sample 30 minority
teachers from this stratum of 60, so that the selection probability is one out of two (30 from 60), whereas
the selection probability for the rest of the population is about .074 (70 from 940). When estimating
for the whole teacher population, each minority teacher is given a weight of 0.2 (6/30) and each of the
rest has a weight of 1.34 (94/70). Separate analyses can then be done on the 30 minority teachers to
explore the characteristics of this group.
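
The weights in this example can be reproduced with a short sketch. The weight is computed as the stratum's share of the population divided by its share of the sample, which yields the 0.2 and 1.34 given above; the non-minority stratum is taken as 1,000 - 60 = 940 teachers. The satisfaction scores (3.0 and 4.0) and the function names are hypothetical, introduced only to show what the weights do to an overall estimate.

```python
def relative_weight(stratum_pop, total_pop, stratum_n, total_n):
    """Population share of a stratum divided by its share of the sample."""
    return (stratum_pop / total_pop) / (stratum_n / total_n)


def weighted_mean(values, weights):
    """Weighted average of the sampled values."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


TOTAL_POP, TOTAL_N = 1000, 100
MINORITY_POP, MINORITY_N = 60, 30
OTHER_POP, OTHER_N = TOTAL_POP - MINORITY_POP, TOTAL_N - MINORITY_N  # 940 and 70

# Selection probabilities within each stratum.
p_minority = MINORITY_N / MINORITY_POP  # 30 from 60
p_other = OTHER_N / OTHER_POP           # 70 from 940

w_minority = relative_weight(MINORITY_POP, TOTAL_POP, MINORITY_N, TOTAL_N)
w_other = relative_weight(OTHER_POP, TOTAL_POP, OTHER_N, TOTAL_N)

# Hypothetical satisfaction scores: every minority teacher scores 3.0, every other teacher 4.0.
values = [3.0] * MINORITY_N + [4.0] * OTHER_N
weights = [w_minority] * MINORITY_N + [w_other] * OTHER_N

unweighted = sum(values) / len(values)
weighted = weighted_mean(values, weights)

print("weights:", round(w_minority, 2), round(w_other, 2))
print("unweighted mean:", round(unweighted, 2))
print("weighted mean:", round(weighted, 2))
```

With these made-up scores the unweighted mean is pulled toward the oversampled stratum, while the weighted mean restores the six percent and 94 percent population shares.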

Complex variance. When a cluster sampling method is used in design and selection, a variance
estimation method assuming a simple random sample is not appropriate. Simply put, a cluster sample
of size n tends to yield a larger variance than does a simple random sample of the same size. This
is mainly due to what Kish (1965) called the intraclass correlation among the responses from a
particular cluster (say, students from a class). The effect of a cluster sampling design on variance is
measured by the design effect, which is the ratio of the actual cluster sample variance estimate to the
simple random sample variance estimate for the same sample size n. If a cluster sample is
obtained, but the analysis is performed using the method for simple random sample data, the result 
is usually an underestimation of the actual variance. In practice, the complex variance is usually 
estimated by approximation using a variety of resampling strategies (Lee et al., 1989). 
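
As one illustration of a resampling strategy, the sketch below applies a delete-one-cluster jackknife
to the sample mean and compares the result with the simple random sample formula. The choice of the
jackknife, the function names, and the toy data (four clusters of three made-up scores) are
assumptions for illustration only; the source does not prescribe a particular method.

```python
from statistics import mean, variance

def srs_variance_of_mean(values):
    """Variance of the mean if the data were a simple random sample."""
    return variance(values) / len(values)

def jackknife_variance_of_mean(clusters):
    """Delete-one-cluster jackknife variance of the overall sample mean."""
    k = len(clusters)
    replicates = []
    for i in range(k):
        # Recompute the mean with cluster i left out entirely.
        kept = [y for j, c in enumerate(clusters) if j != i for y in c]
        replicates.append(mean(kept))
    center = mean(replicates)
    return (k - 1) / k * sum((r - center) ** 2 for r in replicates)

# Made-up scores: responses within a cluster resemble each other.
clusters = [
    [0.5, 1.0, 1.5],
    [2.5, 3.0, 3.5],
    [4.5, 5.0, 5.5],
    [6.5, 7.0, 7.5],
]
all_values = [y for c in clusters for y in c]

v_srs = srs_variance_of_mean(all_values)
v_cluster = jackknife_variance_of_mean(clusters)
design_effect = v_cluster / v_srs

print(round(v_srs, 3), round(v_cluster, 3), round(design_effect, 2))
```

For these toy data the jackknife variance is about 1.67 against about 0.47 under the simple random
sample formula, a design effect of roughly 3.5. Real survey data would normally have many more
clusters, and specialized survey software would handle strata and weights as well.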

Nonresponse adjustment. Possible estimation bias due to nonresponse needs investigation. A
survey population can be practically viewed as consisting of two subpopulations, a respondent population
and a nonrespondent population. The decision to become a nonrespondent or a respondent is seldom
random. Therefore, there can be distinct differences between the two populations on variables of
interest. The purpose of a survey is to obtain accurate information about the whole population, whose
quantity can be expressed as the sum of the weighted values from the two populations:

Q = W_1 q_1 + W_2 q_2

where Q is the population value of interest, q_1 is the known sample response value, q_2 is the unknown
nonresponse value, and W_1 and W_2 are the weights for the percentage of the population in the respondent
and nonrespondent subpopulations, respectively. The estimation of Q depends on both W_2 and q_2. Without
a direct investigation, q_2 is always unknown. The best way to ensure a reasonable Q is to keep W_2 as
small as possible to minimize the impact of the nonrespondent population on the whole population.
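
The role of the nonrespondent weight can be illustrated with a small Python sketch of the weighted
sum above. The variable names (q1 for the respondent value, q2 for the nonrespondent value, w2 for
the nonrespondent share) and the numbers are assumptions chosen for illustration. Because the
nonrespondent value is unknown, the sketch only brackets the population value between the smallest
and largest values it could plausibly take.

```python
def population_value(q1, q2, w2):
    """Weighted sum of respondent and nonrespondent values.

    w2 is the nonrespondent share; the respondent share is 1 - w2.
    """
    w1 = 1.0 - w2
    return w1 * q1 + w2 * q2

def bounds_on_population_value(q1, w2, q2_low, q2_high):
    """Range of the population value when q2 is only known to lie in a range."""
    return (population_value(q1, q2_low, w2),
            population_value(q1, q2_high, w2))

# Hypothetical case: 70% of respondents report being satisfied (q1 = 0.70);
# the nonrespondent proportion could be anywhere from 0 to 1.
low_40, high_40 = bounds_on_population_value(0.70, 0.40, 0.0, 1.0)
low_10, high_10 = bounds_on_population_value(0.70, 0.10, 0.0, 1.0)

print(round(low_40, 2), round(high_40, 2))   # 40% nonresponse
print(round(low_10, 2), round(high_10, 2))   # 10% nonresponse
```

With 40 percent nonresponse the population value could lie anywhere from 0.42 to 0.82; with 10
percent nonresponse the range shrinks to 0.63 through 0.73, which is why a small nonrespondent share
limits the damage the unknown value can do.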

Assessment of nonresponse effects and possible adjustments aim to find out what the nonrespondent
value would be if the nonrespondents had responded. Because no one ever knows what this value is, we
have to rely on some reasonable approximation. A host of techniques are available in the sample survey
literature. Typically, respondents and nonrespondents are compared on some ancillary information that
is thought to be related to the survey purposes and may influence one's response to survey questions.
Various weighting and imputation methods have also been proposed to compensate for both unit
nonresponse and item nonresponse.
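
One common weighting method for unit nonresponse is a weighting-class adjustment, in which
respondents in each class are weighted up to stand in for the nonrespondents in the same class. The
sketch below is a minimal illustration under the assumption that nonrespondents resemble respondents
within a class; the class labels, counts, and function name are invented for the example.

```python
def weighting_class_factors(sampled_counts, respondent_counts):
    """Return a nonresponse adjustment factor for each weighting class.

    The factor is (number sampled) / (number responding) in the class.
    """
    factors = {}
    for cls, n_sampled in sampled_counts.items():
        n_responded = respondent_counts[cls]
        if n_responded == 0:
            raise ValueError(f"no respondents in class {cls!r}")
        factors[cls] = n_sampled / n_responded
    return factors

# Hypothetical sample of 100 teachers with uneven response by school level.
sampled = {"elementary": 60, "secondary": 40}
responded = {"elementary": 48, "secondary": 20}

factors = weighting_class_factors(sampled, responded)
adjusted_total = sum(factors[c] * responded[c] for c in responded)

print(factors, adjusted_total)
```

Here each elementary respondent counts as 1.25 teachers and each secondary respondent as 2, so the
weighted respondents again add up to the 100 teachers originally sampled.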

In educational survey studies, complex sample designs of one kind or another are frequently
seen in the literature. These designs include not only cluster sampling but also multi-stage sampling that
may involve disproportional selection. However, few studies with such designs give any indication
that weighting, complex variance estimation, or nonresponse adjustments were ever
used or considered. Typically, a report describes the sample, the data collection method, and the analyses
conducted in a way that suggests a strictly simple random sample was used. The data analyses
are usually done with commercial statistical computing packages such as SAS or SPSS, where
computation is based on the assumption of simple random sample data. When the data from
a complex sample design are submitted to such a statistical package, the variances are
underestimated. Consequently, in hypothesis testing, the statistical power is falsely enhanced and a
statistically significant result is more likely to occur at a given error rate. In parameter estimation, the
underestimation of variance leads to the construction of unduly narrow bounds around the point
estimates, giving the false impression of great precision in estimation. All these are problematic
areas to look for when reading a survey research report.
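
The consequences described above can be illustrated with a short Python sketch. It assumes that a
design effect is available and that the simple random sample standard error is inflated by its
square root, which follows from the definition of the design effect as a ratio of variances. All
numbers and names are hypothetical.

```python
import math

def adjusted_standard_error(srs_se, design_effect):
    """Inflate a simple random sample standard error by sqrt(design effect)."""
    return srs_se * math.sqrt(design_effect)

def confidence_interval(estimate, se, z=1.96):
    """Normal-approximation interval around a point estimate."""
    return (estimate - z * se, estimate + z * se)

# Hypothetical results: mean satisfaction 3.80, software-reported SE 0.10,
# design effect 2.5, and a comparison value of 3.55.
estimate, srs_se, design_effect, comparison = 3.80, 0.10, 2.5, 3.55

naive_ci = confidence_interval(estimate, srs_se)
adj_se = adjusted_standard_error(srs_se, design_effect)
adjusted_ci = confidence_interval(estimate, adj_se)

naive_z = (estimate - comparison) / srs_se
adjusted_z = (estimate - comparison) / adj_se

print([round(x, 2) for x in naive_ci], round(naive_z, 2))
print([round(x, 2) for x in adjusted_ci], round(adjusted_z, 2))
```

In this made-up case the unadjusted analysis yields a test statistic of about 2.5, which would be
declared significant at the .05 level, while the adjusted statistic is about 1.58 and would not; the
adjusted interval is also visibly wider.
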

Summary

When it comes to evaluating the quality of survey study findings, one needs to examine the
sample design with great care because the design determines the sample selection and estimation
procedures. There are many things to look for in a sample design, depending on what information
one is seeking. In a practical sense, however, six criteria were deemed most important for consumers
of survey research to keep in mind when reading a report of survey research.

The six key criteria covered both selection and estimation in general. In particular, a good 
survey sample design should have information about a clearly specified population of interest. This 
information relates to the validity and generalizability of the findings of the study. 

The unit of analysis should be explicitly stated in the design so that the relevance of the 
chosen unit of analysis to the research questions of the survey study can be checked. It is always 
helpful to ask whether the right unit of analysis is used in analysis for the given purposes of a study. 

Reporting the achieved sample size alone is not enough. The planned sample size and how the
size was determined need to be explained in a report. Without this information, readers are unable to
know why a certain sample size was chosen over other possibilities or whether such aspects as statistical
power, margin of error, and design effect were ever taken into consideration.

The procedures with which a sample is actually selected help readers not only evaluate the 
quality of the sample, but also plan for similar sample selection in their own survey studies. 

Nonresponse is a headache for survey researchers and has to be dealt with carefully.
Treatment of nonresponse can begin with good follow-up efforts to increase the response rate in the
selection stage. In the data analysis stage, necessary adjustments can be made to reduce the impact of
nonresponse. Such information should be made available to readers so that they can better understand
the meaning of the survey results as reported.

Finally, several things call for consideration in estimation. These are weighting, complex
variance estimation, and nonresponse adjustment. These procedures are often needed because true
simple random samples are seldom used, and nonresponse of some degree is a reality in
most survey studies. Without well-planned estimation procedures, the results of data analysis can be
distorted. It is very important to know how the statistics or population estimates are obtained from the data
and, given this knowledge, how justifiable and useful the findings may be.




References

Aiken, L. R. (1988). The problem of nonresponse in survey research. Journal of Experimental
Education, 56(3), 116-119.

American Psychological Association. (1995). Publication manual of the American Psychological
Association (4th ed.). Washington, DC: Author.

Babbie, E. (1986). The practice of social research (4th ed.). Belmont, CA: Wadsworth.

Babbie, E. (1990). Survey research methods (2nd ed.). Belmont, CA: Wadsworth.

Barnett, V. (1991). Sample survey principles and methods. New York: Oxford University Press.

Borg, W. R., & Gall, M. D. (1989). Educational research (5th ed.). New York: Longman.

Cochran, W. G. (1977). Sampling techniques. New York: John Wiley & Sons.

Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ: Lawrence
Erlbaum.

Fink, A. (1995). How to sample in surveys. Thousand Oaks, CA: Sage.

Jaeger, R. M. (1988). Survey research methods in education. In R. M. Jaeger (Ed.), Complementary
methods for research in education (pp. 303-330). Washington, DC: American Educational
Research Association.

Jolliffe, F. R. (1986). Survey design and analysis. New York: John Wiley & Sons.

Kalton, G. (1983a). Compensating for missing survey data. Ann Arbor, MI: Institute for Social
Research, University of Michigan.

Kalton, G. (1983b). Introduction to survey sampling. Beverly Hills, CA: Sage.

Kish, L. (1965). Survey sampling. New York: John Wiley & Sons.

Lee, E. S., Forthofer, R. N., & Lorimor, R. J. (1989). Analyzing complex survey data. Newbury Park,
CA: Sage.

McNamara, J. F. (1994). Surveys and experiments in education research. Lancaster, PA: Technomic.

Miskel, C., & Sandlin, T. (1981). Survey research in educational administration. Educational
Administration Quarterly, 17(4), 1-20.

Pena, D. M., & Henderson, R. D. (1986). Sampling procedures used for national surveys of public
school teachers: Problems and possible solutions. Paper presented at the annual meeting of
the American Educational Research Association, San Francisco, CA.

Permut, J. E., Michel, A. J., & Joseph, M. (1976). The researcher's sample: A review of the choice
of respondents in marketing research. Journal of Marketing Research, 13, 278-283.

Scheaffer, R. L., Mendenhall, W., & Ott, L. (1990). Elementary survey sampling (4th ed.). Boston,
MA: PWS-Kent.

Shultz, S. D., & Luloff, A. E. (1990). The threat of nonresponse bias to survey research. Journal of
the Community Development Society, 21(2), 104-115.

Sudman, S. (1976). Applied sampling. New York: Academic Press.

Thompson, S. K. (1992). Sampling. New York: Wiley.

Wang, L. (1996). A typology and evaluation of the survey sample designs in the Educational
Administration Quarterly: 1980-1995. Unpublished doctoral dissertation, Texas A&M University,
College Station.

Wang, L., & McNamara, J. F. (1997, March). An evaluation of the sample designs in educational survey
research. Paper presented at the annual meeting of the American Educational Research
Association, Chicago, IL.

West, L. J. (1991). A frequently violated requirement in survey research. Journal of Education for
Business, 66(3), 134-135.



